Search for: All records

Creators/Authors contains: "Ye, Qing"

  1. The soaring drug overdose crisis in the United States has claimed more than half a million lives in the past decade and remains a major public health threat. The ability to predict drug overdose deaths at the county level can help local communities develop action plans in response to emerging changes. Applying off-the-shelf machine learning algorithms for prediction can be challenging due to the heterogeneous risk profiles of the counties and suppressed data in common publicly available data sources. To fill these gaps, we develop a cluster-aware supervised learning (CASL) framework to enhance the prediction of county-level drug overdose deaths. This CASL model simultaneously clusters counties into groups based on geographical and socioeconomic characteristics and minimizes the loss function that accounts for suppressed values and cluster-specific regularization. Our computational study uses real-world data from 2010 to 2021, focusing on the ten states most severely impacted by the drug overdose crisis. The results demonstrate that our proposed CASL framework significantly outperforms state-of-the-art methods by achieving a superior balance in prediction accuracy for both unsuppressed and suppressed observations. The proposed model also identifies different clusters of counties, capturing heterogeneous patterns of overdose mortality among counties of diverse characteristics. 
    Free, publicly-accessible full text available April 11, 2026
  2. Free, publicly-accessible full text available July 7, 2026
  3. Emerging studies underscore the promising capabilities of large language model-based chatbots in conducting basic bioinformatics data analyses. The recent feature of accepting image inputs by ChatGPT, also known as GPT-4V(ision), motivated us to explore its efficacy in deciphering bioinformatics scientific figures. Our evaluation with examples in cancer research, including sequencing data analysis, multimodal network-based drug repositioning, and tumor clonal evolution, revealed that ChatGPT can proficiently explain different plot types and apply biological knowledge to enrich interpretations. However, it struggled to provide accurate interpretations when color perception and quantitative analysis of visual elements were involved. Furthermore, while the chatbot can draft figure legends and summarize findings from the figures, stringent proofreading is imperative to ensure the accuracy and reliability of the content.
    Free, publicly-accessible full text available December 1, 2025
  4. There are insufficient accurate biomarkers and effective therapeutic targets in current cancer treatment. Multi-omics regulatory networks in patient bulk tumors and single cells can shed light on molecular disease mechanisms. Integration of multi-omics data with large-scale patient electronic medical records (EMRs) can lead to the discovery of biomarkers and therapeutic targets. In this review, multi-omics data harmonization methods were introduced, and common approaches to molecular network inference were summarized. Our Prediction Logic Boolean Implication Networks (PLBINs) have advantages over other methods in constructing genome-scale multi-omics networks in bulk tumors and single cells in terms of computational efficiency, scalability, and accuracy. Based on the constructed multi-modal regulatory networks, graph theory network centrality metrics can be used in the prioritization of candidates for discovering biomarkers and therapeutic targets. Our approach to integrating multi-omics profiles in a patient cohort with large-scale patient EMRs such as the SEER-Medicare cancer registry combined with extensive external validation can identify potential biomarkers applicable in large patient populations. These methodologies form a conceptually innovative framework to analyze various available information from research laboratories and healthcare systems, accelerating the discovery of biomarkers and therapeutic targets to ultimately improve cancer patient survival outcomes. 
  5. There are currently no accurate biomarkers for optimal treatment selection in early-stage non-small cell lung cancer (NSCLC). Novel therapeutic targets are needed to improve NSCLC survival outcomes. This study systematically evaluated the association between genome-scale regulatory network centralities and NSCLC tumorigenesis, proliferation, and survival in early-stage NSCLC patients. Boolean implication networks were used to construct multimodal networks using patient DNA copy number variation, mRNA, and protein expression profiles. T statistics of differential gene/protein expression in tumors versus non-cancerous adjacent tissues, dependency scores in in vitro CRISPR-Cas9/RNA interference (RNAi) screening of human NSCLC cell lines, and hazard ratios in univariate Cox modeling of The Cancer Genome Atlas (TCGA) NSCLC patients were correlated with graph theory centrality metrics. Hub genes in multi-omics networks involving gene/protein expression were associated with oncogenic and proliferative potential and with poor patient survival outcomes (p < 0.05, Pearson’s correlation). Immunotherapy targets PD1, PDL1, CTLA4, and CD27 were ranked as top hub genes within the 10th percentile in most constructed multi-omics networks. BUB3, DNM1L, EIF2S1, KPNB1, NMT1, PGAM1, and STRAP were discovered as important hub genes in NSCLC proliferation with oncogenic potential. These results support the importance of hub genes in NSCLC tumorigenesis, proliferation, and prognosis, with implications in prioritizing therapeutic targets to improve patient survival outcomes.
  6. Breast cancer treatment can be improved with biomarkers for early detection and individualized therapy. A set of 86 microRNAs (miRNAs) was identified that separated breast cancer tumors from normal breast tissues (n = 52) with an overall accuracy of 90.4%. Six miRNAs had concordant expression in both tumors and breast cancer patient blood samples compared with the normal control samples. Twelve miRNAs showed concordant expression in tumors vs. normal breast tissues and patient survival (n = 1093), with seven as potential tumor suppressors and five as potential oncomiRs. From experimentally validated target genes of these 86 miRNAs, pan-sensitive and pan-resistant genes with concordant mRNA and protein expression associated with in vitro drug response to 19 NCCN-recommended breast cancer drugs were selected. Combined with in vitro proliferation assays using CRISPR-Cas9/RNAi and patient survival analysis, MEK inhibitors PD19830 and BRD-K12244279, pilocarpine, and tremorine were discovered as potential new drug options for treating breast cancer. Multi-omics biomarkers of response to the discovered drugs were identified using human breast cancer cell lines. This study presented an artificial intelligence pipeline of miRNA-based discovery of biomarkers, therapeutic targets, and repositioning drugs that can be applied to many cancer types.
  7. Isoprene affects new particle formation rates in environments and experiments also containing monoterpenes. 
  8. The majority of lung cancer patients are diagnosed with metastatic disease. This study identified a set of 73 microRNAs (miRNAs) that classified lung cancer tumors from normal lung tissues with an overall accuracy of 96.3% in the training patient cohort (n = 109) and 91.7% in unsupervised classification and 92.3% in supervised classification in the validation set (n = 375). Based on association with patient survival (n = 1016), 10 miRNAs were identified as potential tumor suppressors (hsa-miR-144, hsa-miR-195, hsa-miR-223, hsa-miR-30a, hsa-miR-30b, hsa-miR-30d, hsa-miR-335, hsa-miR-363, hsa-miR-451, and hsa-miR-99a), and 4 were identified as potential oncogenes (hsa-miR-21, hsa-miR-31, hsa-miR-411, and hsa-miR-494) in lung cancer. Experimentally confirmed target genes were identified for the 73 diagnostic miRNAs, from which proliferation genes were selected from CRISPR-Cas9/RNA interference (RNAi) screening assays. Pansensitive and panresistant genes to 21 NCCN-recommended drugs with concordant mRNA and protein expression were identified. DGKE and WDR47 were found with significant associations with responses to both systemic therapies and radiotherapy in lung cancer. Based on our identified miRNA-regulated molecular machinery, an inhibitor of PDK1/Akt BX-912, an anthracycline antibiotic daunorubicin, and a multi-targeted protein kinase inhibitor midostaurin were discovered as potential repositioning drugs for treating lung cancer. These findings have implications for improving lung cancer diagnosis, optimizing treatment selection, and discovering new drug options for better patient outcomes. 
  9. In NSCLC, there is a pressing need for immunotherapy predictive biomarkers. The processes underlying B-cell dysfunction, as well as their prognostic importance in NSCLC, are unknown. Tumor-specific B-cell gene co-expression networks were constructed by comparing the Boolean implication modeling of single-cell RNA sequencing of NSCLC tumor B cells and normal B cells. Proliferation genes were selected from the networks using in vitro CRISPR-Cas9/RNA interference (RNAi) screening data in more than 92 human NSCLC epithelial cell lines. The prognostic and predictive evaluation was performed using public NSCLC transcriptome and proteome profiles. A B cell proliferation and prognostic gene co-expression network was present only in normal lung B cells and missing in NSCLC tumor B cells. A nine-gene signature was identified from this B cell network that provided accurate prognostic stratification using bulk NSCLC tumor transcriptome (n = 1313) and proteome profiles (n = 103). Multiple genes (HLA-DRA, HLA-DRB1, OAS1, and CD74) differentially expressed in NSCLC B cells, peripheral blood lymphocytes, and tumor T cells had concordant prognostic indications at the mRNA and protein expression levels. The selected genes were associated with drug sensitivity/resistance to 10 commonly used NSCLC therapeutic regimens. Lestaurtinib was discovered as a potential repositioning drug for treating NSCLC.
  10. There is currently no gene expression assay that can assess if premalignant lesions will develop into invasive breast cancer. This study sought to identify biomarkers for selecting patients with a high potential for developing invasive carcinoma in the breast with normal histology, benign lesions, or premalignant lesions. A set of 26-gene mRNA expression profiles was used to distinguish invasive ductal carcinomas from histologically normal tissue and benign lesions and to select those with a higher potential for future cancer development (ADHC) in the breast associated with atypical ductal hyperplasia (ADH). The expression-defined model achieved an overall accuracy of 94.05% (AUC = 0.96) in classifying invasive ductal carcinomas from histologically normal tissue and benign lesions (n = 185). This gene signature classified cancer development in ADH tissues with an overall accuracy of 100% (n = 8). The mRNA expression patterns of these 26 genes were validated using RT-PCR analyses of independent tissue samples (n = 77) and blood samples (n = 48). The protein expression of PBX2 and RAD52 assessed with immunohistochemistry was prognostic of breast cancer survival outcomes. This signature provided significant prognostic stratification in The Cancer Genome Atlas breast cancer patients (n = 1100), as well as basal-like and luminal A subtypes, and was associated with distinct immune infiltration and activities. The mRNA and protein expression of the 26 genes was associated with sensitivity or resistance to 18 NCCN-recommended drugs for treating breast cancer. Eleven genes had significant proliferative potential in CRISPR-Cas9/RNAi screening. Based on this gene expression signature, the VEGFR inhibitor ZM-306416 was discovered as a new drug for treating breast cancer.